Regularized Max Pooling for Image Categorization
Author: Minh Hoai
Abstract
We propose Regularized Max Pooling (RMP) for image classification. RMP classifies an image (or image region) by extracting feature vectors from multiple subwindows at multiple locations and scales. Unlike Spatial Pyramid Matching, where the subwindows are defined purely by geometric correspondence, RMP accounts for the deformation of discriminative parts. The amount of deformation and the discriminative ability of multiple parts are learned jointly during training.

An RMP model is a collection of filters. Each filter is anchored to a specific image subwindow and associated with a set of deformation coefficients. The anchoring subwindows are predetermined at various locations and scales, while the filters and deformation coefficients are learnable parameters of the model. Fig. 1 shows a possible way to define subwindows. To classify a test image, RMP extracts feature vectors for all anchoring subwindows. The classification score of an image is the weighted sum of all filter responses. Each filter yields a set of filter responses, one for each level of deformation. The deformation coefficients are the weights for these filter responses.

Given a set of images $\{I_i\}_{i=1}^{n}$ and labels $\{y_i \mid y_i \in \{1, -1\}\}_{i=1}^{n}$, consider a particular set of geometrically defined subwindows that can encode the semantic content of an image at different locations and scales (e.g., Fig. 1). Let $\{I^j\}_{j=1}^{m}$ denote the set of subwindows for image $I$. Let $\phi$ be the feature function that takes an image region as input and outputs a column vector. Let $D^j$ be the feature matrix computed at subwindow $j$ for all images and $K^j$ the corresponding kernel, i.e., $D^j = [\phi(I_1^j) \cdots \phi(I_n^j)]$ and $K^j = (D^j)^T D^j$. The joint kernel for all subwindows is the sum of all kernels: $K = \sum_{j=1}^{m} K^j$; this corresponds to concatenating the feature vectors computed at all subwindows. Given the kernel $K$, we train a Least-Squares SVM and obtain a coefficient vector $\alpha$ and bias term $b$.
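The kernel construction and training step described above can be sketched in NumPy. This is a minimal illustration, not the author's implementation: the random features stand in for $\phi$, the names `joint_kernel`, `train_lssvm`, and the regularization weight `lam` are hypothetical, and the linear system used is the standard Least-Squares SVM formulation (minimize `lam * ||w||^2` plus squared error), which the abstract does not spell out.

```python
import numpy as np

def joint_kernel(feature_mats):
    """Sum the per-subwindow linear kernels K^j = (D^j)^T D^j."""
    return sum(D.T @ D for D in feature_mats)

def train_lssvm(K, y, lam=1.0):
    """Solve the LS-SVM system for (alpha, b).

    Assumed formulation: (K + lam*I) alpha + b*1 = y, with 1^T alpha = 0.
    """
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = K + lam * np.eye(n)
    A[:n, n] = 1.0   # column multiplying the bias term b
    A[n, :n] = 1.0   # constraint row: the coefficients sum to zero
    rhs = np.concatenate([y, [0.0]])
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n]

# Toy data: n images, m subwindows, d-dimensional features (all assumed).
rng = np.random.default_rng(0)
n, m, d = 6, 3, 4
feature_mats = [rng.standard_normal((d, n)) for _ in range(m)]  # D^1..D^m
y = np.array([1.0, -1.0, 1.0, 1.0, -1.0, -1.0])

K = joint_kernel(feature_mats)

# Summing kernels matches the kernel of the concatenated feature vectors.
D_concat = np.vstack(feature_mats)
assert np.allclose(K, D_concat.T @ D_concat)

alpha, b = train_lssvm(K, y)
```

The assertion checks the claim that summing the per-subwindow kernels is equivalent to concatenating the feature vectors from all subwindows before computing a single linear kernel.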
The filter for subwindow $j$ can be computed as $w^j = D^j \alpha$. For a particular subwindow $j$ and an image $I$, the regularized maximum score is defined:
Similar resources
Minh Hoai: Regularized Max Pooling for Image Categorization
We propose Regularized Max Pooling (RMP) for image classification. RMP classifies an image (or an image region) by extracting feature vectors at multiple subwindows at multiple locations and scales. Unlike Spatial Pyramid Matching where the subwindows are defined purely based on geometric correspondence, RMP accounts for the deformation of discriminative parts. The amount of deformation and the...
Emergence of Selective Invariance in Hierarchical Feed Forward Networks
Many theories have emerged which investigate how invariance is generated in hierarchical networks through simple schemes such as max and mean pooling. The restriction to max/mean pooling in theoretical and empirical studies has diverted attention away from a more general way of generating invariance to nuisance transformations. In this exploratory study, we study the conjecture that hierarchica...
Joint Dictionary and Classifier Learning for Categorization of Images Using a Max-margin Framework
The Bag-of-Visual-Words (BoVW) model is a popular approach for visual recognition. Used successfully in many different tasks, simplicity and good performance are the main reasons for its popularity. The central aspect of this model, the visual dictionary, is used to build mid-level representations based on low level image descriptors. Classifiers are then trained using these mid-level represent...
Multiple spatial pooling for visual object recognition
Global spatial structure is an important factor for visual object recognition but has not attracted sufficient attention in recent studies. Especially, the problems of features' ambiguity and sensitivity to location change in the image space are not yet well solved. In this paper, we propose multiple spatial pooling (MSP) to address these problems. MSP models global spatial structure with multi...
Efficient Multiclass Implementations of L1-Regularized Maximum Entropy
This paper discusses the application of L1-regularized maximum entropy modeling or SL1-Max [9] to multiclass categorization problems. A new modification to the SL1-Max fast sequential learning algorithm is proposed to handle conditional distributions. Furthermore, unlike most previous studies, the present research goes beyond a single type of conditional distribution. It describes and compares ...